Abstract

Using a method of entropic analysis of time series we establish the correlation
between heartbeat long-range memory and mortality risk in patients with
congestive heart failure.

This paper deals with an advanced aspect of statistical mechanics whose recent
demonstration [1] is shown here to afford an efficient criterion for assessing the mortality
risk of patients with Congestive Heart Failure (CHF). We define the condition corresponding
to the mean position of healthy subjects in a plane that, as we shall see, we call the
physiological plane. We show that the distance of a CHF patient from this optimal condition
correlates with mortality, and that the CHF patients very close to it survived. The importance
of this criterion does not need to be defended. We think that the demonstration of this
important property is of interest for both researchers at the frontier of statistical mechanics
and cardiologists. The first step of our approach is based on observing a sequence of numbers
$\{T_i\}$, denoting the time distance between two nearest-neighbor pulses of a given
electrocardiogram (ECG). One way to measure the time between two pulses is to take the time
elapsed between two adjacent R waves in the recorded electrical signal. This time is referred
to as the RR interval, and the resulting sequence is consequently called the RR sequence.
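
As a minimal illustration of how an RR sequence arises from detected R waves, the following sketch converts R-peak times into interbeat intervals. The function name, the units (seconds in, milliseconds out) and the sample peak times are assumptions made for the example; they are not taken from the recordings used in this study.

```python
from typing import List, Sequence


def rr_intervals_ms(r_peak_times_s: Sequence[float]) -> List[float]:
    """Return the RR sequence in ms from R-peak times given in seconds."""
    pairs = list(zip(r_peak_times_s, r_peak_times_s[1:]))
    if any(later <= earlier for earlier, later in pairs):
        raise ValueError("R-peak times must be strictly increasing")
    # Each RR interval is the time between two adjacent R waves.
    return [1000.0 * (later - earlier) for earlier, later in pairs]


if __name__ == "__main__":
    # Hypothetical R-peak times, for illustration only.
    print(rr_intervals_ms([0.00, 0.80, 1.62, 2.40, 3.25]))
```

A series of N detected R waves yields N - 1 intervals, which is the sequence analyzed in the rest of the paper.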

We considered for this study 13 male CHF patients, from a study base of 320 subjects,
who experienced cardiac death during a follow-up of 26 months (average 19 months, median
22 months). Inclusion criteria were absence of pulmonary or neurological disease, absence
of acute myocardial infarction or cardiac surgery within the previous 6 months, absence of
any other disease limiting survival, stable therapy for at least 2 weeks, and good-quality
24-hour Holter recordings with an ectopy rate of less than 5%. A comparable number of control
subjects (16 patients), matched for age, sex, NYHA class (a functional and therapeutic
classification for prescription of physical activity for cardiac patients) and etiology, was then
selected. These latter patients did not experience cardiac death by the end of the follow-up.
All patients had a 24-hour Holter recording at baseline, together with a standard functional
evaluation including measurement of left ventricular ejection fraction (LVEF), peak VO2 and
sodium (Na). Finally, RR series for 10 healthy subjects were taken from the NOnLinear Time
Series AnaLysIS (NOLTISALIS) archive. This latter data set is the result of the collaboration of
several interdisciplinary Italian research centers. Experienced analysts edited these Holter
recordings, manually correcting interbeat times due to ectopic beats. This editing work
yielded the RR sequence, from which we generated the time series $\{T_i\}$.

The meaning of a given value $T_i$, as stated earlier, is the time distance between the $i$-th
and the $(i+1)$-th pulse. The sequence $\{T_i\}$ can be studied as a new time series, with $i$
playing the role of "time". Moreover, the value $T_i$, expressed as a function of $i$ with
$i \gg 1$, can be thought of as a function $T(t)$ of a continuous time variable $t$.

The real curve $T(t)$ looks erratic and disordered. However, our method of analysis
shows that it is quite different from a random process. For the reader to get an intuitive
understanding of this attractive but perplexing conclusion, let us first describe an ideal
model, with extended memory (EM), for the time evolution of $T(t)$. First of all, we assume
that for a given time $\tau_{em}$ the curve $T(t)$ keeps a given slope $a$; then it abruptly
gets a new slope, $a'$, for an interval of time $\tau'_{em}$, after which an abrupt transition
to a new slope $a''$ takes place, for a time $\tau''_{em}$, and so on. It is evident that the
resulting $T(t)$ has the form of a zig-zag curve. We shall refer to the individual straight-line
intervals of this curve as laminar regions. Any laminar region is associated with its own
$\tau_{em}$. Then we introduce the extended memory property. This is done by assuming for the
waiting time distribution $\psi(\tau_{em})$ the following inverse power law form

$$\psi(\tau_{em}) = (\nu - 1)\,\frac{[(\nu - 2)\langle\tau_{em}\rangle]^{\nu - 1}}{[(\nu - 2)\langle\tau_{em}\rangle + \tau_{em}]^{\nu}}, \qquad (1)$$

with $2 < \nu < 3$, where $\langle\tau_{em}\rangle$ is the average waiting time. This means that
if the EM model is directly observable, we can derive from $T(t)$ the sequence
$\{\tau_{em}(j)\}$, where the discrete index $j$ denotes the time order of a given laminar region.
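
To make the inverse power law of Eq. (1) concrete, the sketch below draws laminar-region durations by inverse-transform sampling. It assumes the normalized form of Eq. (1) with scale T = (ν - 2)⟨τ_em⟩, whose survival probability is (T/(T + τ))^(ν - 1). The function names and parameter values are illustrative assumptions, not part of the original analysis.

```python
import random
from typing import List


def scale(nu: float, mean_tau: float) -> float:
    """Scale T = (nu - 2) * <tau_em> entering Eq. (1); requires 2 < nu < 3."""
    if not 2.0 < nu < 3.0:
        raise ValueError("nu must lie strictly between 2 and 3")
    return (nu - 2.0) * mean_tau


def survival(tau: float, nu: float, mean_tau: float) -> float:
    """Probability that a laminar region lasts longer than tau."""
    T = scale(nu, mean_tau)
    return (T / (T + tau)) ** (nu - 1.0)


def tau_from_uniform(u: float, nu: float, mean_tau: float) -> float:
    """Invert the survival probability: the tau whose survival equals u (0 < u <= 1)."""
    T = scale(nu, mean_tau)
    return T * (u ** (-1.0 / (nu - 1.0)) - 1.0)


def sample_laminar_times(n: int, nu: float, mean_tau: float, seed: int = 0) -> List[float]:
    """Draw n waiting times distributed according to Eq. (1)."""
    rng = random.Random(seed)
    # 1 - rng.random() lies in (0, 1], which avoids raising 0 to a negative power.
    return [tau_from_uniform(1.0 - rng.random(), nu, mean_tau) for _ in range(n)]


if __name__ == "__main__":
    # Hypothetical parameters, chosen only for illustration.
    print(sample_laminar_times(5, nu=2.5, mean_tau=10.0))
```

Because the variance of this distribution diverges for 2 < ν < 3, sample averages converge slowly to ⟨τ_em⟩, so short samples should not be expected to reproduce the mean closely.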

It is interesting to notice that this dynamic process is essentially equivalent to the strong
anomalous diffusion model recently proposed by the authors of Refs. [2,3] to explain the
effects of a ballistic mechanism in Rayleigh-Bénard convection. This model, in turn, is
nothing but a generalization of the dynamic approach to Lévy statistics proposed years ago
by the authors of Ref. [5]. In fact, the model of Ref. [5] is recovered from the model we
are adopting here by assuming that the slope $a$ has only two values, of equal intensity and
opposite sign [6]. With this equivalence in mind, we adopt the specific walking prescription
proposed by Ref. [4]. We consider a given time $t$ and we evaluate the number of laminar
regions that have been completed within this time. Let us call this number $n$. Then the
trajectory is

$$y(t) = nW. \qquad (2)$$

This means that the random walker makes a jump ahead, by the same quantity $W$, at the
end of any laminar region. The quantity $W$ is arbitrary, and in the following we assume
$W = 1$. Then, again according to the prescriptions of Ref. [4], we create the trajectories

$$x(l) = y(t + l) - y(t). \qquad (3)$$

It is evident that we can move the index $t$ from $t = 0$ to $L - l$, where $L$ is the total time
length of the sequence under study. This makes it possible for us to evaluate the probability
distribution $p(x, l)$, which at $l = 0$ is a Dirac delta located at $x = 0$ and broadens as $l$
increases. An important step of our approach rests on the determination of the Shannon
entropy of this distribution, namely

$$S(l) = -\int_{-\infty}^{\infty} dx\, p(x, l) \log p(x, l). \qquad (4)$$

This is the reason why this technique of analysis has been termed the Diffusion Entropy (DE)
method [7]. According to the theory of Ref. [4], we immediately reach the conclusion that
$p(x, l)$ fulfils the scaling property

$$p(x, l) = \frac{1}{l^{\delta}}\, F\!\left(\frac{x}{l^{\delta}}\right), \qquad (5)$$

with $\delta$ being the scaling parameter, which is related to $\nu$ by

$$\delta = \frac{1}{\nu - 1}. \qquad (6)$$

In this specific case, it is straightforward to prove, by plugging $p(x, l)$ of Eq. (5) into
Eq. (4), that

$$S(l) = A + \delta \log(l). \qquad (7)$$
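
The step from the scaling property of Eq. (5) to the logarithmic growth of Eq. (7) can be spelled out with the change of variable $y = x / l^{\delta}$. The only ingredient beyond Eqs. (4) and (5) is the normalization of the scaling function $F$, which follows from the normalization of $p(x, l)$. With these assumptions, the constant $A$ is identified as the entropy of $F$ itself:

```latex
\begin{align*}
S(l) &= -\int_{-\infty}^{\infty} dx\, \frac{1}{l^{\delta}}
        F\!\left(\frac{x}{l^{\delta}}\right)
        \left[\log F\!\left(\frac{x}{l^{\delta}}\right) - \delta \log l\right] \\
     &= -\int_{-\infty}^{\infty} dy\, F(y) \log F(y)
        + \delta \log l \int_{-\infty}^{\infty} dy\, F(y) \\
     &= A + \delta \log l,
     \qquad A \equiv -\int_{-\infty}^{\infty} dy\, F(y) \log F(y).
\end{align*}
```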



This means that the DE should yield for $S(l)$, expressed in a linear-log plot, a straight line
whose slope is the sought value of the scaling parameter $\delta$.
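
The procedure described above (events, walker trajectory, overlapping windows, Shannon entropy, slope in a linear-log plot) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used for the results of this paper: time is taken as an integer index, each event is a jump $W = 1$, the integral of Eq. (4) is replaced by a sum over the integer values of $x$, and the slope is obtained by an ordinary least-squares fit. All function names are hypothetical.

```python
import math
import random
from collections import Counter
from typing import List, Sequence


def event_series(event_times: Sequence[int], length: int) -> List[int]:
    """xi[t] = number of events (jumps of size W = 1) at integer time t."""
    xi = [0] * length
    for t in event_times:
        if 0 <= t < length:
            xi[t] += 1
    return xi


def diffusion_entropy(xi: Sequence[int], l: int) -> float:
    """Shannon entropy of x(l) = y(t + l) - y(t) over all window starts t."""
    if not 1 <= l <= len(xi):
        raise ValueError("window length l must lie between 1 and len(xi)")
    n_windows = len(xi) - l + 1
    x = sum(xi[:l])                      # displacement in the first window
    histogram = Counter({x: 1})
    for t in range(1, n_windows):
        x += xi[t + l - 1] - xi[t - 1]   # slide the window one step to the right
        histogram[x] += 1
    return -sum((c / n_windows) * math.log(c / n_windows)
                for c in histogram.values())


def least_squares_slope(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Slope of the least-squares straight line through the points (xs, ys)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    den = sum((a - mx) ** 2 for a in xs)
    return num / den


def estimate_scaling(xi: Sequence[int], windows: Sequence[int]) -> float:
    """Slope of S(l) against log(l), i.e. the scaling parameter of Eq. (7)."""
    logs = [math.log(l) for l in windows]
    entropies = [diffusion_entropy(xi, l) for l in windows]
    return least_squares_slope(logs, entropies)


if __name__ == "__main__":
    rng = random.Random(1)
    # Memoryless toy events (not heartbeat data): one event per step with probability 0.1.
    times = [t for t in range(20000) if rng.random() < 0.1]
    print(estimate_scaling(event_series(times, 20000), [50, 100, 200, 400, 800]))
```

For the memoryless toy events in the usage example the walker undergoes ordinary diffusion, so the fitted slope should come out in the neighborhood of 0.5. Events whose waiting times follow an inverse power law with index between 2 and 3 are expected to give a larger slope, in line with Eq. (6).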

This way of proceeding is impossible in practice, because the real $T(t)$ curve significantly
departs from the zig-zag form of the EM model. We make the conjecture that the departure
from this ideal condition is caused by the fact that the actual signal $T(t)$ is the superposition
of the EM model signal and a much stronger, but totally random, component. This makes
it impossible for us to directly evaluate $\psi(\tau_{em})$. We proceed as follows. Let us
represent the $T(t)$ time evolution in a $(T, t)$ plane, with the ordinate referring to $T$ and
the abscissa to $t$. The ordinate axis is divided into cells of equal size $s$. This means that
we divide the $(T, t)$ plane into strips of size $s$, and that in the ideal case of constant
frequency the trajectory $T(t)$ would remain forever within the same strip. Actually, transitions
from one strip to another occur frequently. We call these transitions markers. These
markers might have quite different origins. The majority of the markers are determined by
the short-time noise. Many other markers correspond to the hidden laminar regions of the
underlying EM model; we call these markers "pseudoevents". As in Ref. [1], we indicate with
the term pseudoevent a marker that does not correspond to an unpredictable transition, but
is a consequence of the division of the $(T, t)$ plane into strips. Only a very small number of
markers coincide with the turning points of the EM signal, or are sufficiently close to them.
We call these markers real events.

Now we have to explain why the DE is sensitive only to the real events. The time distance
between the $j$-th and the $(j+1)$-th marker defines the time $\tau_{exp}(j)$ of the experimental
sequence $\{\tau_{exp}(j)\}$. It is evident that even in the case where $T(t)$ were an exact
realization of the EM model, the waiting time distribution $\psi(\tau_{exp})$ might turn out to
be totally different from $\psi(\tau_{em})$. In fact, if $s$ is very small, the same laminar region
is divided into many smaller time intervals of the same length. These are pseudoevents. It is
evident that these pseudoevents do not contribute to the spreading of the distribution
$p(x, l)$, and consequently do not contribute to the entropy increase. This means that the DE
method is insensitive to pseudoevents.
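
The construction of markers from the strips, and the origin of equally spaced pseudoevents inside a single laminar region, can be sketched as follows. The sketch assumes that strips are aligned to zero, that time is measured in beats, and that a marker is assigned to the beat at which the signal is first found in a different strip. These conventions and the function names are illustrative assumptions.

```python
import math
from typing import List, Sequence


def strip_index(value_ms: float, s_ms: float) -> int:
    """Index of the strip of size s_ms that contains value_ms."""
    return math.floor(value_ms / s_ms)


def marker_positions(T_ms: Sequence[float], s_ms: float) -> List[int]:
    """Beat indices i at which T lies in a different strip than at beat i - 1."""
    return [i for i in range(1, len(T_ms))
            if strip_index(T_ms[i], s_ms) != strip_index(T_ms[i - 1], s_ms)]


def waiting_times(markers: Sequence[int]) -> List[int]:
    """Number of beats between consecutive markers."""
    return [b - a for a, b in zip(markers, markers[1:])]


if __name__ == "__main__":
    # A single hypothetical laminar region: T grows by 2 ms per beat.
    ramp = [800 + 2 * i for i in range(50)]
    markers = marker_positions(ramp, s_ms=20)
    print(markers, waiting_times(markers))
```

In the usage example one laminar region with constant slope is cut into intervals of identical length. These are pseudoevents in the sense of the text: they are produced by the strips, not by any unpredictable change of the signal.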

What about the short-time random events? They, in principle, contribute to the entropy
increase, and consequently could affect the determination of the crucial parameter $\delta$. We
can show that they do not. Notice that in the presence of the short-time random component
the actual signal $\tau_{exp}(t)$ is given by

$$\tau_{exp}(t) = a\,\tau_{st}(t) + b\,R(t), \qquad (8)$$

with $a \ll 1$ and $b$ close to 1. The first contribution corresponds to the EM model, and
the second is generated by short-time random fluctuations. The correlation function of this
signal is

$$C_{exp}(t) = p\,C_{st}(t) + (1 - p)\,C_{random}(t), \qquad (9)$$

where $C_{st}(t)$ is an inverse power law relaxation and $C_{random}(t)$ a relaxation function
decaying to zero in one time step. If we take, without loss of generality,
$\langle\tau_{st}^2\rangle = \langle R^2\rangle = \langle\tau_{exp}^2\rangle$, implying
$a^2 + b^2 = 1$, then we have $p = a^2$ and $(1 - p) = b^2$. How should the parameter $p$ be
evaluated? This is easily done by monitoring the experimental correlation function at the first
time step. Since $C_{random}(t)$ decays to zero in one step, while $C_{st}(t)$ is much slower,
we immediately obtain $p = C_{exp}(1)$. Its evaluation is not independent of that of the other
parameter, $s$. This is so because the experimental evaluation of $\tau_{exp}(t)$ is dependent
on $s$.

However, while $p$ is $s$-dependent, the parameter $\delta$ is not. The DE method has the
surprising capability of yielding a value for $\delta$ that is independent of $s$, even when
a strong short-time random component is present, and not only when the ideal EM model
applies. How can it be so? The EM model component yields superdiffusion, while the random
component generates ordinary diffusion. In the asymptotic limit of very large values of $l$,
the superdiffusion component, which is faster than ordinary diffusion, becomes predominant,
and the DE method again detects the correct scaling $\delta$ even in the case where $p \ll 1$.

The parameter $p$ is very important, since it defines the statistical weight of the EM
component present in the experimental signal $T(t)$. However, its dependence on $s$ makes its
use questionable. Nevertheless, we see that all curves $p(s)$ share the same property: they take
small values for both small and large values of $s$, with a clearly defined maximum in between,
which we refer to as $\pi$. This maximum is a property independent of $s$. In fact, we
note that almost all the healthy subjects reach their maximum at 30 ms, while most of the
CHF patients reach theirs at 20 ms. The typical dependence of $p$ on $s$ is illustrated in
Fig. 1, where we can see both the case of 5 healthy subjects with $\pi$ located at 30 ms, and
5 CHF patients with $\pi$ located at 20 ms. We think that the parameter $\pi$ is a reliable
measure of the EM component. Consequently, we decided to represent the physiological
conditions of all subjects, healthy and CHF alike, in the $(\delta, \pi)$ plane, which we call
the physiological plane. The criteria adopted to define the physiological plane make the
resulting "phase-space" diagram independent of the coarse-graining parameter $s$, so that the
location of any patient in the physiological plane is an objective property. A mere visual
inspection shows that the healthy subjects and CHF patients mix only in a relatively small
region. The overlap region is so small as to make it possible for us to claim that healthy
subjects and CHF patients belong to two distinct regions of the physiological plane [8].
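
The evaluation of the weight $p$ as the first-step value of the experimental correlation function, and of the second coordinate of the physiological plane as the maximum of $p(s)$, can be sketched as follows. The sketch assumes that the correlation function is the normalized autocorrelation of the mean-subtracted sequence of times between markers, and that $s$ is scanned over a user-supplied grid. Both choices, like the function names, are assumptions of this example and are not specified in the text.

```python
import math
from typing import Dict, List, Sequence, Tuple


def marker_waiting_times(T_ms: Sequence[float], s_ms: float) -> List[int]:
    """Beats elapsed between consecutive strip changes of T, for strip size s_ms."""
    markers = [i for i in range(1, len(T_ms))
               if math.floor(T_ms[i] / s_ms) != math.floor(T_ms[i - 1] / s_ms)]
    return [b - a for a, b in zip(markers, markers[1:])]


def lag1_autocorrelation(x: Sequence[float]) -> float:
    """Normalized autocorrelation at lag 1 of the mean-subtracted sequence."""
    n = len(x)
    if n < 3:
        return 0.0
    mean = sum(x) / n
    d = [v - mean for v in x]
    variance_sum = sum(v * v for v in d)
    if variance_sum == 0.0:
        return 0.0                      # constant sequence: no correlation defined
    return sum(d[i] * d[i + 1] for i in range(n - 1)) / variance_sum


def p_of_s(T_ms: Sequence[float], s_grid_ms: Sequence[float]) -> Dict[float, float]:
    """Weight p for each strip size s in the grid."""
    return {s: lag1_autocorrelation(marker_waiting_times(T_ms, s)) for s in s_grid_ms}


def maximum_of_p(p_curve: Dict[float, float]) -> Tuple[float, float]:
    """Return (s at which p is largest, the largest value of p)."""
    s_star = max(p_curve, key=p_curve.get)
    return s_star, p_curve[s_star]
```

Applied to an RR series, `maximum_of_p(p_of_s(rr_ms, range(5, 65, 5)))` would return the strip size at which the curve peaks together with the peak value. The grid from 5 to 60 ms is a hypothetical choice.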

The division of the CHF patients into two groups, dead and alive, makes the result of our
analysis still more remarkable. To show this important property we proceed as follows. First
of all, we define the center of gravity of the healthy subjects, denoted by a white square in
Fig. 2. We call this point the optimal condition. Then, for any CHF patient we measure the
Euclidean distance from the optimal condition, which makes it possible for us to rank the
CHF patients in order of increasing distance: the first CHF patient is the one with minimum
distance from the optimal condition, the second is the one with minimum distance among the
remaining patients, and so on. We find that the first 7 patients are alive. The eighth patient
is dead, and from there on alive and dead patients are interleaved. This suggests that the
closer the patient is to the optimal condition, the higher the survival probability.
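
The ranking step can be sketched as follows. The coordinates in the usage example are invented placeholders, not the values plotted in Fig. 2, and the function names are hypothetical. The sketch also assumes that both coordinates enter the Euclidean distance unscaled.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]   # a position in the physiological plane


def centroid(points: Sequence[Point]) -> Point:
    """Center of gravity of a set of points (the optimal condition)."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def rank_by_distance(patients: Sequence[Tuple[str, Point]],
                     optimum: Point) -> List[Tuple[int, str, float]]:
    """Rank patients by increasing Euclidean distance from the optimum.

    Returns (rank, label, distance) triples, with rank 1 for the closest patient.
    """
    scored = sorted((math.dist(position, optimum), label)
                    for label, position in patients)
    return [(rank, label, distance)
            for rank, (distance, label) in enumerate(scored, start=1)]


if __name__ == "__main__":
    # Placeholder coordinates, for illustration only.
    healthy = [(0.80, 0.30), (0.84, 0.34), (0.82, 0.32)]
    chf = [("patient A", (0.70, 0.20)), ("patient B", (0.81, 0.30))]
    for row in rank_by_distance(chf, centroid(healthy)):
        print(row)
```

The ranks produced in this way are the quantities that enter a rank-sum test.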

To support this important property in a more rigorous way, we apply the Mann-Whitney
method [9]. This is a non-parametric test, namely, it does not rest on Gaussian distributions
for the data. Let us count the number of patients dead, $N_{dead}$, and the number of patients
alive, $N_{alive}$. The data of Fig. 2 refer to a case where $N_{dead} = 13$ and
$N_{alive} = 16$. Let us consider the group with the smaller number of individuals, namely the
group of dead patients. Then, let us evaluate the sum of the ranks of this group and let us call
it $\mu_{exp}$. We note that $\mu_{exp} = 246$. Under the hypothesis of no correlation between
our parameters and the death probability, this sum has a probability distribution that, for
more than 8 elements, is expected [9] to be approximately a Gaussian with mean $\mu_{dead}$
and standard deviation $\sigma$ given by

$$\mu_{dead} = \frac{N_{dead}\,(N_{dead} + N_{alive} + 1)}{2}, \qquad \sigma = \sqrt{\frac{N_{dead}\,N_{alive}\,(N_{dead} + N_{alive} + 1)}{12}}. \qquad (10)$$

From the data of Fig. 2 we obtain $\mu_{dead} = 195$ and $\sigma = 22.8$, while, as noted
earlier, $\mu_{exp} = 246$. This means that the distance of $\mu_{dead}$ from $\mu_{exp}$ is
$2.37\sigma$. In practice, we are allowed to rule out the hypothesis of no correlation between
the death of CHF patients and their distance from the optimal condition: the probability for
the distribution of dead and alive patients of Fig. 2 to be fortuitous is less than 3%.
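
The Gaussian approximation used here can be checked with a few lines. The sketch relies on the standard rank-sum expressions for the mean and standard deviation under the no-correlation hypothesis, ignores ties, and assumes a two-sided Gaussian tail for the probability. The text does not state which tail convention was used, and the function names are hypothetical.

```python
import math
from typing import Tuple


def rank_sum_null(n_small: int, n_large: int) -> Tuple[float, float]:
    """Mean and standard deviation of the rank sum of the smaller group
    under the hypothesis of no correlation (no ties assumed)."""
    n_total = n_small + n_large
    mean = n_small * (n_total + 1) / 2.0
    sigma = math.sqrt(n_small * n_large * (n_total + 1) / 12.0)
    return mean, sigma


def z_and_two_sided_p(observed: float, mean: float, sigma: float) -> Tuple[float, float]:
    """Distance from the mean in units of sigma, and two-sided Gaussian tail."""
    z = (observed - mean) / sigma
    return z, math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    mean, sigma = rank_sum_null(13, 16)            # group sizes quoted in the text
    z, p_value = z_and_two_sided_p(246, mean, sigma)
    print(mean, round(sigma, 1), round(z, 2), round(p_value, 3))
```

With the group sizes quoted in the text, the sketch reproduces the mean of 195 and the standard deviation of 22.8, and the two-sided tail probability falls below 3%. The ratio computed directly from the quoted numbers, (246 - 195)/22.8, is about 2.24 rather than the 2.37 reported above, a difference that this sketch does not attempt to explain.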

Moreover, it is important to remark that the alive patients corresponding to points in the
physiological plane far from the optimal condition either had a serious pathology, being
classified as NYHA class III (severe physical limitations; they are comfortable only at rest)
and therefore required a heart transplant anyway, or had a very short follow-up time (less
than 6 months). Only six alive patients did not fall into either of these categories,
but it is remarkable that all six of them occupy positions that overlap with the zone of the
healthy subjects. Unfortunately, the small number of sequences available to us at this time
does not allow us to calculate survival curves or to attempt any further conclusion.

It is convenient to compare the result of this paper to the literature in the field of
nonlinear or fractal analysis of cardiological data. The most recent examples are those of
Refs. [1,10,11]. Although based on different perspectives, multifractality [10,11] and memory
beyond memory [1] as a sign of healthy physiological condition, neither of these two groups
could take the ambitious step of granting physicians a reliable criterion for making crucial
decisions about CHF patients. These findings, however, suggest that the $\delta$ and $\pi$
indexes, and especially the distance from the optimal condition in the physiological plane,
should be considered for inclusion in the list of candidate predictors of future large-scale
prospective studies for risk stratification of CHF patients.

FIG. 1. $p$ as a function of $s$ for: a) a group of healthy subjects; b) a group of CHF patients.
For clarity, we have plotted only 5 subjects for each group.

FIG. 2. Positions of healthy (circles) and CHF (diamonds) subjects in the physiological plane.
The white diamonds correspond to patients alive after the end of the experiment, while the black
ones to patients who were either dead or urgently transplanted. The white square (optimal
condition, see text) represents the average position of healthy subjects in the plot.